54 research outputs found

    Spectral Perturbation and Reconstructability of Complex Networks

    In recent years, many network perturbation techniques, such as topological perturbations and service perturbations, have been employed to study and improve the robustness of complex networks. However, there is no general way to evaluate network robustness. In this paper, we propose a new global measure for a network, the reconstructability coefficient θ, defined as the maximum number of eigenvalues that can be removed, subject to the condition that the adjacency matrix can still be reconstructed exactly. Our main finding is that a linear scaling law, E[θ] = aN, seems universal, in that it holds for all networks that we have studied. (Comment: 9 pages, 10 figures)
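    The reconstruction test described in the abstract can be sketched numerically: drop eigenvalues from the spectral decomposition of a symmetric adjacency matrix and check whether rounding the truncated expansion recovers the matrix exactly. This is a minimal illustration, not the paper's exact procedure; the 5-node cycle graph and the smallest-magnitude removal order are assumptions.

```python
import numpy as np

def reconstructability(A, m):
    """True if A is recovered exactly after removing m eigenvalues."""
    vals, vecs = np.linalg.eigh(A)          # A = V diag(vals) V^T
    keep = np.argsort(np.abs(vals))[m:]     # drop the m smallest |eigenvalue|
    A_approx = (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T
    return np.array_equal(np.rint(A_approx).astype(int), A)

# 5-node cycle graph as a toy example
A = np.zeros((5, 5), dtype=int)
for i in range(5):
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1

# theta: the largest number of removable eigenvalues, as in the abstract
theta = max(m for m in range(6) if reconstructability(A, m))
print(theta)
```

    Rounding works as long as the omitted spectral terms perturb every matrix entry by less than 0.5, which is what bounds how many eigenvalues are removable.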

    Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local principal component analysis to approximate the basis vectors of the tangent hyperplane of the class manifold at each training sample. The dictionary in SRC is replaced by a local dictionary that adapts to the test sample and includes training samples and their corresponding tangent basis vectors. We use a synthetic data set and three face databases to demonstrate that this method can achieve higher classification accuracy than SRC in cases of sparse sampling, nonlinear class manifolds, and stringent dimension reduction. (Comment: Published in "Computational Intelligence for Pattern Recognition", editors Shyi-Ming Chen and Witold Pedrycz. The original publication is available at http://www.springerlink.co)
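    The core SRC step, sparse decomposition over a training dictionary followed by per-class residual comparison, can be sketched as follows. ISTA stands in for the l1 solver, and the two-class synthetic data are invented for illustration; this is the baseline SRC, not the paper's tangent-augmented variant.

```python
import numpy as np

def ista(D, y, lam=0.01, iters=500):
    """Approximately solve min_x 0.5*||Dx - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        g = x - D.T @ (D @ x - y) / L        # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(D, labels, y):
    """Assign y to the class whose coefficients best reconstruct it."""
    x = ista(D, y)
    residuals = {c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(0)
# two classes living on different 1-D subspaces; noisy copies as training samples
b0, b1 = rng.normal(size=(2, 20))
D = np.column_stack([b0 * s + rng.normal(scale=0.05, size=20)
                     for s in (1.0, 0.8, -1.2)] +
                    [b1 * s + rng.normal(scale=0.05, size=20)
                     for s in (1.0, -0.9, 1.5)])
D /= np.linalg.norm(D, axis=0)               # unit-norm columns, as in SRC
labels = np.array([0, 0, 0, 1, 1, 1])
print(src_classify(D, labels, 0.7 * b0 / np.linalg.norm(b0)))
```

    The paper's modification would add tangent basis vectors (from local PCA) to the columns of D near the test sample, so sparsely sampled or curved class manifolds are better covered.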

    Representing complex data using localized principal components with application to astronomical data

    Often the relation between the variables constituting a multivariate data space might be characterized by one or more of the terms "nonlinear", "branched", "disconnected", "bent", "curved", "heterogeneous", or, more generally, "complex". In these cases, simple principal component analysis (PCA) as a tool for dimension reduction can fail badly. Of the many alternative approaches proposed so far, local approximations of PCA are among the most promising. This paper gives a short review of localized versions of PCA, focusing on local principal curves and local partitioning algorithms. Furthermore, we discuss projections other than the local principal components. When performing local dimension reduction for regression or classification problems, it is important to focus not only on the manifold structure of the covariates but also on the response variable(s). Local principal components achieve only the former, whereas localized regression approaches concentrate on the latter. Local projection directions derived from the partial least squares (PLS) algorithm offer an interesting trade-off between these two objectives. We apply these methods to several real data sets. In particular, we consider simulated astrophysical data from the future Galactic survey mission Gaia. (Comment: 25 pages. In "Principal Manifolds for Data Visualization and Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds), Lecture Notes in Computational Science and Engineering, Springer, 2007, pp. 180--204, http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-)
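    A toy version of the local-partitioning idea can be sketched in a few lines: partition the data (plain k-means here, as one simple choice, not the paper's specific algorithm) and fit a separate first principal component in each region, so a bent curve is approximated piecewise.

```python
import numpy as np

def local_pca(X, k=3, iters=20, seed=0):
    """Partition X into k regions and return each region's first PC."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):                   # plain k-means partitioning
        assign = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[assign == j].mean(0) for j in range(k)])
    directions = []
    for j in range(k):                       # first PC of each local patch
        Xc = X[assign == j] - centers[j]
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        directions.append(vt[0])
    return centers, np.array(directions)

# a bent ("complex") curve that a single global PC would summarize poorly
t = np.linspace(0, np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t)])
centers, dirs = local_pca(X)
```

    Each local direction approximates the curve's tangent within its region, which is exactly where a single global PCA direction fails on data like this semicircle.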

    Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

    Background: Bacteria biotopes cover a wide range of diverse habitats, including animal and plant hosts and natural, medical, and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. It is of great importance for fundamental research and for microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field.

    Manifold Learning for Human Population Structure Studies

    The dimension of the population genetics data produced by next-generation sequencing platforms is extremely high. However, the “intrinsic dimensionality” of sequence data, which determines the structure of populations, is much lower. This motivates us to use locally linear embedding (LLE), which projects high-dimensional genomic data into a low-dimensional, neighborhood-preserving embedding, as a general framework for population structure and historical inference. To facilitate application of LLE to population genetic analysis, we systematically investigate several important properties of LLE and reveal its connection with principal component analysis (PCA). Identifying a set of markers and genomic regions that could be used for population structure analysis will provide invaluable information for population genetics and association studies. In addition to identifying LLE-correlated or PCA-correlated structure-informative markers, we have developed a new statistic that integrates the genomic information content of a region for collectively studying its association with population structure, and a LASSO algorithm to search for such regions across the genome. We applied the developed methodologies to a low-coverage pilot dataset from the 1000 Genomes Project and a Phase III Mexico dataset from the HapMap. We observed that 25.1%, 44.9% and 21.4% of the common variants and 89.2%, 92.4% and 75.1% of the rare variants were LLE-correlated markers in CEU, YRI and ASI, respectively. This shows that rare variants, which are often private to specific populations, have much higher power to identify population substructure than common variants. These preliminary results demonstrate that next-generation sequencing offers rich resources and that LLE provides a powerful tool for population structure analysis.
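    The LLE step itself can be sketched compactly in plain NumPy. This is the standard algorithm (reconstruct each sample from its k nearest neighbors, then embed using the bottom eigenvectors of (I - W)^T (I - W)), not the paper's full pipeline, and the synthetic curve below stands in for genotype data.

```python
import numpy as np

def lle(X, n_components=1, k=8, reg=1e-3):
    """Minimal locally linear embedding of the rows of X."""
    n = len(X)
    dist = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]          # skip the point itself
        Z = X[nbrs] - X[i]
        G = Z @ Z.T + reg * np.trace(Z @ Z.T) * np.eye(k)  # regularized Gram
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()                     # weights sum to one
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]               # drop the constant mode

# noisy 1-D curve embedded in 3-D; LLE should recover the curve parameter t
t = np.sort(np.random.default_rng(0).uniform(0, 1, 120))
X = np.column_stack([np.cos(4 * t), np.sin(4 * t), t])
Y = lle(X)   # one embedding coordinate per sample, ordered along the curve
```

    In the population setting, rows of X would be individuals and columns genetic markers; the low-dimensional coordinates then expose population substructure, analogously to the leading PCs.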

    HypertenGene: extracting key hypertension genes from biomedical literature with position and automatically-generated template features

    Background: The genetic factors leading to hypertension have been extensively studied, and large numbers of research papers have been published on the subject. One of hypertension researchers' primary tasks is to locate key hypertension-related genes in abstracts. However, gathering such information with existing tools is not easy: (1) searching for articles often returns far too many hits to browse through; (2) the search results do not highlight the hypertension-related genes discovered in the abstract; (3) even though some text-mining services mark up gene names in the abstract, the key genes investigated in a paper are still not distinguished from other genes. To facilitate the information-gathering process for hypertension researchers, one solution is to extract the key hypertension-related genes in each abstract. Three major tasks are involved in the construction of this system: (1) gene and hypertension named entity recognition, (2) section categorization, and (3) gene-hypertension relation extraction.
    Results: We first compare the retrieval performance achieved by individually adding template features and position features to the baseline system. Then, the combination of both is examined. We found that using position features can almost double the original AUC score of the baseline system (0.8140 vs. 0.4936). However, adding template features results in only marginal improvement (0.0197). Including both improves the AUC to 0.8184, indicating that the two feature sets are complementary and do not have overlapping effects. We then examine performance in a different domain, diabetes, and obtain a satisfactory AUC of 0.83.
    Conclusion: Our approach successfully exploits template features to recognize true hypertension-related gene mentions and position features to distinguish key genes from other related genes. Templates are automatically generated and checked by biologists to minimize labor costs. Our approach integrates the advantages of machine-learning models and pattern matching. To the best of our knowledge, this is the first systematic study of extracting hypertension-related genes and the first attempt to create a hypertension-gene relation corpus based on the GAD database. Furthermore, our paper proposes and tests novel features for extracting key hypertension genes, such as relative position, section, and template features, which could also be applied to key-gene extraction for other diseases.
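    The idea of template features can be illustrated in a few lines: each template becomes a binary feature indicating whether a sentence matches it with the candidate gene in the slot. The patterns and the example sentence below are invented for illustration, not the paper's automatically generated templates.

```python
import re

# each pattern has a GENE slot; a match yields a 1-valued binary feature
TEMPLATES = [r"\bGENE\b.{0,40}\bassociated with\b.{0,40}\bhypertension\b",
             r"\bpolymorphism(s)? of\b.{0,20}\bGENE\b",
             r"\bGENE\b.{0,30}\bexpression\b"]

def template_features(sentence, gene):
    """Return one 0/1 feature per template for a candidate gene mention."""
    s = sentence.replace(gene, "GENE")      # normalize the candidate into the slot
    return [int(re.search(t, s, re.IGNORECASE) is not None) for t in TEMPLATES]

feats = template_features(
    "AGT was significantly associated with hypertension in this cohort.",
    "AGT")
print(feats)  # -> [1, 0, 0]
```

    In the actual system such features, together with position and section features, would feed a machine-learning classifier rather than being used as hard rules.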

    Optimized cross-layer forward error correction coding for H.264 AVC video transmission over wireless channels

    Forward error correction (FEC) codes that can provide unequal error protection (UEP) have recently been used for video transmission over wireless channels. These video transmission schemes may also benefit from the use of FEC codes at both the application layer (AL) and the physical layer (PL). However, the interaction and optimal setup of UEP FEC codes at the AL and the PL have not previously been investigated. In this paper, we study the cross-layer design of FEC codes at both layers for H.264 video transmission over wireless channels. In our scheme, UEP Luby transform codes are employed at the AL and rate-compatible punctured convolutional codes at the PL. Video slices are first prioritized based on their contribution to video quality. Next, we investigate the four combinations of cross-layer FEC schemes at the two layers and concurrently optimize their parameters to minimize the video distortion and maximize the peak signal-to-noise ratio. We evaluate the performance of these schemes on four H.264 test video streams and show the superiority of the optimized cross-layer FEC design. (Peer reviewed; Electrical and Computer Engineering)
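    The prioritize-then-protect idea can be sketched with a toy allocation rule: slices with higher distortion impact receive more FEC parity. The proportional rule below is invented for illustration; the paper instead jointly optimizes the AL and PL code rates against the channel.

```python
def allocate_parity(distortion_impact, total_parity):
    """Split a parity-symbol budget across slices, proportional to impact."""
    total = sum(distortion_impact)
    raw = [total_parity * d / total for d in distortion_impact]
    parity = [int(r) for r in raw]
    # hand out leftover symbols to the largest fractional remainders
    for i in sorted(range(len(raw)), key=lambda i: raw[i] - parity[i],
                    reverse=True)[:total_parity - sum(parity)]:
        parity[i] += 1
    return parity

print(allocate_parity([5.0, 3.0, 2.0], 10))  # -> [5, 3, 2]
```

    In the full scheme this per-slice budget would be realized by the LT code rate at the AL and the RCPC puncturing rate at the PL, with the split between the two layers being exactly what the cross-layer optimization determines.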

    Identifying and tracking entity mentions in a maximum entropy framework

    We present a system for identifying and tracking named, nominal, and pronominal mentions of entities within a text document. Our maximum entropy model for mention detection combines two pre-existing named entity taggers (built to extract different entity categories) and other syntactic and morphological feature streams to achieve competitive performance. We developed a novel maximum entropy model for tracking all mentions of an entity within a document. We participated in the Automatic Content Extraction (ACE) evaluation and performed well. We describe our system and present results of the ACE evaluation.
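    A maximum entropy model over discrete features is equivalent to multinomial logistic regression, which a toy mention detector makes concrete. The binary token features and tiny training set below are invented for illustration; the paper's model uses far richer feature streams.

```python
import numpy as np

def train_maxent(X, y, n_classes, lr=0.5, epochs=300):
    """Fit softmax (maximum entropy) weights by gradient ascent on log-likelihood."""
    W = np.zeros((X.shape[1], n_classes))
    for _ in range(epochs):
        scores = X @ W
        p = np.exp(scores - scores.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)                 # class probabilities
        onehot = np.eye(n_classes)[y]
        W += lr * X.T @ (onehot - p) / len(X)        # log-likelihood gradient
    return W

# features: [is_capitalized, is_pronoun, follows_title_word, bias]
X = np.array([[1, 0, 1, 1],   # "Smith" after "Mr."  -> named mention
              [1, 0, 0, 1],   # "London"             -> named mention
              [0, 1, 0, 1],   # "she"                -> pronominal mention
              [0, 0, 0, 1],   # "table"              -> not a mention
              [0, 0, 0, 1]])  # "ran"                -> not a mention
y = np.array([0, 0, 1, 2, 2]) # 0=named, 1=pronominal, 2=none
W = train_maxent(X, y, 3)
pred = np.argmax(np.array([[1, 0, 1, 1]]) @ W, 1)[0]
```

    The appeal of the maxent framework, as in the paper, is that heterogeneous feature streams (tagger outputs, syntax, morphology) can all be thrown in as extra columns of X without changing the model.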